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We propose a new sensitivity analysis methodology for complex stochastic dynamics based on the Relative 
Entropy Rate. The method becomes computationally feasible at the stationary regime of the process and 
involves the calculation of suitable observables in path space for the Relative Entropy Rate and the cor- 
responding Fisher Information Matrix. The stationary regime is crucial for stochastic dynamics and here 
allows us to address the sensitivity analysis of complex systems, including examples of processes with com- 
plex landscapes that exhibit metastability, non-reversible systems from a statistical mechanics perspective, 
and high-dimensional, spatially distributed models. All these systems exhibit, typically non-gaussian station- 
ary probability distributions, while in the case of high-dimensionality, histograms are impossible to construct 
directly. Our proposed methods bypass these challenges relying on the direct Monte Carlo simulation of rigor- 
ously derived observables for the Relative Entropy Rate and Fisher Information in path space rather than on 
the stationary probability distribution itself. We demonstrate the capabilities of the proposed methodology 
by focusing here on two classes of problems: (a) Langevin particle systems with either reversible (gradient) or 
non-reversible (non-gradient) forcing, highlighting the ability of the method to carry out sensitivity analysis in 
non-equilibrium systems; and, (b) spatially extended Kinetic Monte Carlo models, showing that the method 
can handle high-dimensional problems. 

Keywords: Sensitivity analysis. Relative entropy rate. Fisher information matrix, kinetic Monte Carlo, Markov 
processes, Langevin equations 



I. INTRODUCTION 

In this paper we propose the Relative Entropy Rate 
as a sensitivity analysis tool for complex stochastic dy- 
namics, based on information theory and non-equilibrium 
statistical mechanics methods. These calculations be- 
come computationally feasible at the stationary process 
regime and involve the calculation of suitable observables 
in path space for the Relative Entropy Rate and the cor- 
responding Fisher Information Matrix. The stationary 
regime, i.e. stochastic dynamics where the initial proba- 
bility distribution is the stationary distribution reached 
after long-time integration, is especially crucial for com- 
plex systems: it includes dynamic transitions between 
metastable states in complex, high-dimensional energy 
landscapes, intermittency, as well as Non Equilibrium 
Steady States (NESS) for non-reversible systems, while 
at this regime we also construct phase diagrams for com- 
plex systems. Hence their sensitivity analysis is a cru- 
cial question in determining which parameter directions 
are the most/least sensitive to perturbations, uncertainty 
or errors resulting from parameter estimation. Recently 
there has been significant progress in developing sensi- 
tivity analysis tools for low-dimensional stochastic pro- 
cesses at the transient regime, such as well-mixed chem- 
ical reactions. Some of the mathematical tools included 
discrete derivatives^, Girsanov transformations^''^, poly- 
nomial chaos^, and coupling of stochastic processed. 

On the other hand, it is often the case that we are in- 
terested in the entire probability density function (PDF), 
which in nonlinear and/or discrete systems is typically 
non-Gaussian, and not only in a few moments, due to 



the significance of rare/tail events. For example, it was 
recently shown that in catalytic reactions the most ki- 
netically relevant configurations are occurring rarely, and 
correspond to overlapping tails of (non-Gaussian) PDFs^. 
In that latter direction, there is a broad recent literature 
relying on information theory tools, where sensitivity is 
estimated by using the Relative Entropy and the Fisher 
Information between PDFs, see for instance^"— . In par- 
ticular, such methods were introduced for the study of 
the sensitivity of PDFs to parameters in climate models—; 
there the PDFs structure is known as it is obtained 
through an entropy maximization subject to constraints. 
Knowing the form of the PDF allows to carry out cal- 
culations such as obtaining a Fisher Information Ma- 
trix (FIM), which in turn identifies the most sensitive 
parameter directions. On the other hand, the sensitiv- 
ity of stochastic dynamics can be studied by using the 
FIM-l" . There the authors are employing a linearization 
of the stochastic evolution around the nonlinear mean 
field equation and as a result the form of the PDF is again 
known, and more precisely it is Gaussian hence the FIM 
can be directly computed. Although there are regimes 
where this approximation is applicable (short times, sys- 
tems with a single steady state, etc.), for systems with 
nontrivial long-time dynamics, e.g. metastable, it is not 
correct as large deviation arguments^ show, or even 
explicitly available formulas for escape timesi^. Simi- 
lar issues with non-gaussianity in the long time dynam- 
ics arise in stochastic systems with strongly intermittent 
behavior—. 

Some of these challenges will be addressed through 
the proposed methods which we present next in the con- 
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text of kinetic Monte Carlo models although similar chal- 
lenges and ideas are relevant to all other stochastic molec- 
ular simulation methods. For example, we discuss in 
Section |VlC the sensitivity of algorithms for the numeri- 
cal integration of Langevin dynamics. Moreover, kinetic 
Monte Carlo methods involving surface chemistry are 
formulated in terms of continuous time Markov chains 
(jump processes) on a spatial lattice domain Ajv: at 
each lattice site a; G A^v there is a state space E = 
{0, 1, . . . , K} describing different chemical species (inter- 
acting particles), where the simplest case K — 1 repre- 
sents the well-known lattice-gas model-"'^-. The process at 
is defined as a continuous time Markov Chain (CTMC) 
on the (high-dimensional) state space Sjy = and 
mathematically it is defined completely by specifying the 
local transition rates c^{a,(j') where 6 G M*"' is a vector 
of the model parameters. The transition rates determine 
the updates (jumps) from any current state ctj = cr to a 
(random) new state a' and concrete examples of spatial 
physicochemical models are considered in Section fVlD. 
From the local transition rates one defines the total rate 
A^(ct) = c^{a,a'), which is the intensity of the ex- 
ponential waiting time for a jump from the state cr. The 

transition probabilities are (a, a') = ^^^j^- The ba- 
sic simulation tool for these lattice jump processes is ki- 
netic Monte Carlo (KMC) with a wide range of applica- 
tions from crystal growth, to catalysis, to biology, see for 
instance^^. 

Relative Entropy Rate : In simulations of dynamic transi- 
tions between metastable states on high-dimensional en- 
ergy landscapes or of NESS we are interested in the sen- 
sitivity of stationary processes, i.e., processes for which 
the initial probability distribution is the stationary one 
(reached after long-time integration). Mathematically, 
we want to assess the sensitivity of the CTMC {crt}t>o 
with local transition rates c^{a,a') to a perturbation 
e G R'^ in the parameter vector 9 giving rise to a pro- 
cess {d't}t>o with local transition rates c^^'^((T, cr'), when 
the initial data are sampled from the respective station- 
ary probability distribution. The error analysis in the 
context of the long-time behavior is developed in terms 
of the relative entropy, 
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[0,T] I Q[0,T] 
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(1) 



[0,T] , 



where Qj^g j^j (resp. Q^^^t]) path space probability 

measures of {crf}t>o (resp. {(Jt}t>a) in the time inter- 
val [0,T]. In the case these probability measures have 
corresponding probability densities and q^^'^, (HJ be- 
comes 7^ {q\q^t] I Q[otT]) = S^f log (^) ■ A key prop- 
erty of the relative entropy 7^ (P | Q) is that 7^ (P | Q) > 
with equality if and only if P = Q, which allows us to 
view relative entropy as a "distance" (more precisely a 
semi-metric) between two probability measures P and 
Q. Moreover, from an information theory perspective^i. 



the relative entropy measures loss/change of information, 
e.g. in our context for the process {crt}(>o associated 
with the parameter vector 9, with respect to the pro- 
cess {5t\t>o associated with the parameter vector 6 -\- e. 
Relative entropy for high-dimensional systems was used 
as measure of loss of information in coarse-grainingi^— , 
and sensitivity analysis for climate modeling problems^. 

Starting from ([1]), by Girsanov's formula we obtain 
an explicit expression for the corresponding Radon- 
Nikodym derivative 



dQl 



r^(W)^exp{Eiog3^ 



A''(o-,_)p*(a,_,a,) 



((Ji,-)p*+"(crs-,crs) 



(2) 



on any path of the process {o't}t£[o,T] in terms of the 
jump rates and transition probabilities of both process, 
under suitable non-degeneracy conditions^i. Notice that 
as- denotes the left-hand limit of as at a jump instance 
s. Following calculations regarding the related quan- 
tity of entropy production in non-equilibrium statistical 
mechanics^^, we can show that when the initial distri- 
bution ao ^ /i^ where /i^ (resp. /i^^*^) is the station- 
ary probability disturbution of {at}t>o (resp. {at}t>o), 
then the relative entropy formula simplifies dramatically 
in two parts, one pure equilibrium (scaling as 0(1)) and 
one capturing the stationary dynamics (scaling as 0{T)): 
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where TZ {fi^ \ z^^"*"') is the relative entropy between the 
stationary probabilities, while 



n (Qfo.T] 1 Qlt.T]) = V A'^(<T)p''(a,a') 



X log 
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(4) 



where E^^e denotes the expected value with respect to 
the probability fj,^ . In ([3]), we immediately notice that 
a most relevant quantity to describe the change of infor- 
mation content upon perturbation of model parameters 
of a stochastic process is the 0{T) term, which can be 
thought as a relative entropy per unit time while on the 
other hand, the term TZ (/i^ | /i^^*^) becomes unimportant 
as T grows. 

We will refer from now on to the quantity Q as the 
Relative Entropy Rate (RER), which can be thought as 
the change in information per unit time. Notice that 
RER has the correct time scaling since it is actually in- 
dependent of the interval [0,r]. Furthermore, (|4]) pro- 
vides a computable observable that can be sampled from 
the steady state fj.^ in terms of conventional KMC, by- 
passing the need for a histogram or an explicit formula 
for the high-dimensional probabilities involved in ([T|) . Fi- 
nally, the fact that in stationary regimes, when T ^ I 
in ([3]), the term 7?, (/i^ | becomes unimportant, is 
especially convenient: /i^ and /i^^'^ are typically not 
known explicitly in non-reversible systems, for instance 



2 



in spatially distributed reaction KMC or non-reversible 
Langevin dynamics considered here as examples. 

Fisher Information Matrix on Path Space : An attractive 
approach to sensitivity analysis that is rigorously based 
on relative entropy calculations is the Fisher Information 
Matrix. Indeed, assuming smoothness in the parameter 
vector, it is straightforward to obtain the expansion of 



{Q[o,t] I Q 



[0,T] 



= ie^F7j(Qfo,T])e + 0(|en, (5) 



where the Fisher Information Matrix (FIM) is defined as 
the Hessian of the relative entropy: 



T] I Q[(hT] 



(6) 



As ^ readily suggests, relative entropy is locally a 
quadratic function of the parameter vector 9. Thus spec- 
tral analysis of FTj-provided the matrix is available- 
would allow us to identify which parameter directions are 
the most/least sensitive to perturbations, uncertainty or 
errors resulting from parameter estimation. The source 
of such uncertainties could be related to the assimila- 
tion of experimental datai2^ or finer scale numerical sim- 
ulation, e.g. Density Functional Theory computations 
in the case of molecular simulations^^. More specifi- 
cally, the knowledge of the Fisher Information Matrix 
not only provides a gradient-free method for sensitivity 
analysis, but allows to address questions of parameter 
identifiability^^'^® and optimal experiment design^li^. 
However, the FIM F-ji in (|6]) is not accessible compu- 
tationally in general, nevertheless analytic calculations 
can be performed at equilibrium (e.g., in ergodic sys- 
tems when T — >■ oo) under the assumption or the explicit 
knowledge of the stationary distribution fi. An example 
of such a calculation is under the assumption of a Gaus- 
sian distribution with the mean m{9) and the covariance 
matrix T,(0) in which case the matrix Fn is computed in 
terms of derivatives of the mean and covariance matrix—. 

On the other hand ^ provides a different perspective 
to these issues, giving rise to a computable observable for 
the path space Fisher Information Matrix that includes 
transition rates rather than just the stationary PDFs. 
Indeed, by combining ^ and ([5]) we obtain the following 
expansion for the dominant, 0{T) term in ([3]): 



n 



{Q[o,t] I Q 



[0,T] 
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e^Fn{Qto,T])e + 0{\e\^), (7) 



where the Fisher Information Matrix per unit time, 
F-uiQ^Q rp-^), has the explicit form 



xWg\ogc\a,a')\7g\ogc\a,aY , 



(8) 



where c^{a, a') — \^{(T)p^{a, a'). Fisher Information Ma- 
trices given by © and ([5]) are straightforwardly related 



through lim-r^oo y Ffj = Fh- It is clear from ([8]) that the 
Fisher Information Matrix, just like the Relative Entropy 
Rate ([4]) , is merely an observable that can be sampled us- 
ing KMC algorithms. 

The previous discussion suggests that the proposed ap- 
proach to sensitivity analysis is expected to have the fol- 
lowing features: 

1. It is rigorously valid for the sensitivity of long-time, 
stationary dynamics in path space, including for 
example metastable dynamics in a complex land- 
scape. 

2. It is a gradient- free sensitivity analysis method 
which does not require the knowledge of the equi- 
librium PDFs, as ^ is replaced with a computable 
observable , that contains explicitly information 
for local dynamics. 

3. It is suitable for non-equilibrium systems from 
a statistical mechanics perspective; for example, 
non-reversible processes, such as spatially extended 
reaction-diffusion Kinetic Monte Carlo, where the 
structure of the equilibrium PDF is unknown and 
is typically non-Gaussian. 

4. A key enabling tool for implementing the proposed 
methodology in high-dimensional stochastic sys- 
tems is molecular simulation methods such as KMC 
or Langevin solvers which can sample the observ- 
ables (jlj and ([8|) , and in particular their accelerated 
or scalable versionsi^^"— . 

Indeed, we demonstrate these features by present- 
ing three examples addressing different points: (a) 
the well-mixed bistable reaction system known as the 
Schlogl model which also serves as a benchmark; (b) a 
Langevin particle system with either reversible or non- 
reversible forcing, that demonstrates the ability of the 
proposed method to carry out sensitivity analysis in 
non-equilibrium systems; and, (c) a spatially extended 
KMC model for CO oxidation known as the Ziff-Gulari- 
Barshad (ZGB) model. Such reaction-diffusion models 
are typically non-reversible, hence the sensitivity tools we 
propose here are highly suitable. Regarding this last class 
of problems, we note that in more accurate, state-of-the- 
art KMC models with a large number of parameters^"—, 
kinetic parameters are estimated through density func- 
tional theory (DFT) calculations, hence sensitivity anal- 
ysis is a crucial step in determining the parameters that 
need to be calculated with greater accuracy. 

The paper is organized as follows: in Section [H] we 
present the derivation of the Relative Entropy Rate and 
its corresponding Fisher Information Matrix for discrete- 
time Markov chains while Section IIIII the same observ- 
ables for continuous-time Markov processes (i.e., ([3]), (|4]) 
and (|S])) are derived. Section IIVI generalizes the RER 
and the FIM to time-periodic, inhomogeneous Markov 
processes as well as to semi-Markov processes. Statistical 
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estimators and numerical examples in Section |V] demon- 
strate the efficiency of the proposed sensitivity method, 
while Section IVTl concludes the paper. 



Proof. The path space relative entropy equals to 



II. DISCRETE TIME MARKOV CHAINS 

Let {(Jm}m£Z,+ be a discrete-time time- homogeneous 
Markov chain with separable state space E. The transi- 
tion probability kernel of the Markov chain denoted by 
P^{a,da') depends on the parameter vector 6* £ M'''. As- 
sume that the transition kernel is absolute continuous 
with respect to (w.r.t.) the Lebesgue measure^^ and the 
transition probability density function {a, a') is always 
positive for all a, a' e E and for all 6* e R*^. We further 
assume that {<Jm\m£Z+ has a unique stationary probabil- 
ity distribution denoted by 1/ (cr) . Exploiting the Markov 
property, the path probability distribution Qq for the 
path {(Tm}m=o time horizon 0, A/ starting from 

the stationary distribution /i^((To) is given by 

Ql Af (c^o, • • ^ctm) = ^i%(To)p'^{aa,ai) ■ ■ ■ p'^ {(tm-i,(^m) ■ 

(9) 

We consider the perturbation by e G M'^ and the Markov 
chain {cTm}m£Z+ with the respective transition probabil- 
ity density function, p^^'^{a, cr'), the respective stationary 
density, ^^"•"'^(tr), as well as the respective path distribu- 
tion Qq^/. Then, the Radon-Nikodym derivative of the 
unperturbed path distribution w.r.t. 
distribution takes the form 



the perturbed path 



({^™}) 



, (10) 



which is well-defined since the transition probabilities are 
assumed always positive. 

The following Proposition demonstrates the relative 
entropy representation of the path distribution Qq j^j 

w.r.t. the path distribution Qq^^. 

Proposition II. 1. Under the previous assumptions, 
the path space relative entropy TZ ^Qq \ Qq ^ 



/log 



dQg equals to 



{Qo,m I Qo, 
where 



M 



Mn 



(11) 



^ {Qo,AI I QqM ] — ^tJ. 



/ p {a,a)\og-— -da 

(12 

is the relative entropy rate. 
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Using the relations 

/ p{a,a')da' — 1 & / fi{a)p{a,a')da — fJ.{a') 

J E J E 

the relative entropy is simplified to 

{Qo,m I Qt)lM^ = 1^ P-\cro) log -j^J^^dao 

+ y2 [ [ m'^(Q'0/(Q'm log fl+^T" daidai+i 

JeJe P^+'lo-i.o-i+i) 

= M-H [q'o,m I Qo%) + 7^ (m' I m'""') 



7^ 



□ 



For large times (M ^ 1), the significant term of the 
relative entropy, TZ (^Qq^m I Qo^f) i is the relative entropy 

rate, V. ^Qq j^j \ Qg^^^ , which scales linearly with the 

number of jumps of the Markov chain while the relative 
entropy between the stationary probability distributions, 
7^ (//^ I /i^"*"') , becomes unimportant. Thus, at the sta- 
tionary regime, the appropriate observable for sensitivity 
analysis is the relative entropy rate. Furthermore, the 
RER expression (IT^ incorporates the transition proba- 
bilities of the Markov chain which are typically known 
-for instance, whenever a path sample is needed to be 
generated- while the respective stationary probability 
distributions are typically unknown -for instance, in non- 
reversible systems- and should be computed numerically, 
if possible. Moreover, the path-space RER takes into 
consideration the dynamical aspects of the process while 
the relative entropy between the stationary distributions 
does not take into account any dynamical aspects of the 
process which are critical in metastable or intermittent 
regimes. 

Fisher Information Matrix for Relative entropy rate : 
The relative entropy rate is locally a quadratic functional 
in a neighborhood of 6*. The curvature of the RER around 
6*, defined by its Hessian, is called the Fisher Informa- 
tion Matrix which is formally derived as follows. Let 
5p{(T,a') := p^~^'^{a,a') — p^{a,a'), then the relative en- 
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tropy rate of Qq w.r.t. Qq~^]^j is written as 



^{a)p (a,a)log (^1 + j 

|^/i*(o-)5p(CT, O-') 

Moreover, for all a ^ E/ii holds that 

5p{a,a')da' = / p^^'^ (a, a')da' — p^{a,a')da' 
J E J E 



dada' 



= 1-1 = 

while a under smoothness assumption on the transition 
probability function for the parameter 9, which is an eas- 
ily checkable assumption, a Taylor series expansion is ap- 
plicable to Sp: 

Sp{a,a')^e'^Vep\<J,a') + 0{\e\') 

Thus, we finally obtain that 

0,M I QojVf') 



E J E P ("j" ) 



1 T 



m" (o-)p^ (o-, 0-) Vfl log p" [a, a') 



X Velogp®(cr,crydcrd(7')e + 0(|e|^) 



where 
F-h('3o,j\/) := 



p*((j, (T)Ve logp*((T, (t')Vs logp^(o-, a')^ da' 



(13) 



is the path space Fisher Information Matrix (FIM) for 
the relative entropy rate. Notice that FIM as well as 
RER are computed from the transition probabilities un- 
der mild ergodic average assumptions (see also Section Ivl 
where explicit numerical formulas are provided). 

Remark II. 1. The Fisher information Matrix for 
'^{qV^m\Qom^ agam F-h{Qo m) while the relative 
entropy rates are related for small e through 

n (q^+aJ I Qo.m) = W (Qo.m I Qo!m) + o(kl') 

\ ^ , 
= n[Qf,,M\Qo:M)+o{\ef). 

Remark II. 2. // the transition probability function of 
the Markov chain equals to p^{a^a') = H^{(t') for all 
a,cr' e E and for all 9 G M'"', which is equivalent to 
the fact that the samples are independent, identically dis- 
tributed from the stationary probability distribution, then 
the relative entropy rate between the path probabilities be- 
comes the usual relative entropy between the stationary 



distributions and the path space FIM becomes the usual 
FIM. Indeed, FIM is simplified to 

Fh{Qo,m) 

= / / p" {(^) {(^' '^og^^ {a )V0 log {a' da da' 

J E J E 

= / ^i''((T')Vs log^''((j')Ve log^''((T')^dcr' 
Je 

=■■ F7j(/) 



while we similarly obtain for the relative entropy rate that 
oe I pe+ - - 

otl^ot 



H(Po'',|Por^) = ^(/l/+^)- 



III. CONTINUOUS-TIME MARKOV 
CHAINS 

As in the case of Kinetic Monte Carlo methods, we con- 
sider {crt}igR+ to be a CTMC with countable state space 
E. The parameter dependent transition rates denoted 
by c^{a, a') completely define the jump Markov process. 
The transition rates determine the updates (jumps or so- 
journ times) from a current state a to a new (random) 
state fj' through the total rate X^{a) :— J^a'eE ''"') 
which is the intensity of the exponential waiting time for 
a jump from state a. The transition probabilities for the 

embedded Markov chain {Jn}^-^^ are p^ {a, a') = ^e'^^j^ 
while the generator of the jump Markov process is an 
operator acting on the bounded functions (also called 
observables) /(a) defined on the state space E and fully 
determines the process: 



Cf{a)^ c'{a,a')[fia')-f{a)] 



(15) 



a'eE 



Assume that a new jump Markov process {crt}tGR+ is 
defined by perturbing the transition rates by a small vec- 
tor e € R and that the two path probabilities 

and Q^g*^] are absolute continuous with respect to each 
other which is satisfied when c^{cr,a') = if and only if 
c''+'(cr,cr') = holds for all a,a' G E. Then the Radon- 
Nikodym derivative of the path distribution Q^^ with 

respect to the path distribution Q^pt^] has a explicit for- 
mula known also as Girsanov formula3ii^ 



d-Q[o,T],( -,^ M*('^o) 



dQ 



[0,T] 



\og 4k^^dN. 



{as~,as) 



[\''{a,)~X'+^as)] ds 



(16) 

where fi^ (reps, n^'^'^) is the stationary distributions of 
Wt}tm+ (resp. {(Tt}teM+) while Ns is the counting (of 
the jumps) measure. Having the Girsanov formula, the 
relative entropy is easily derived as the next Proposition 
reveals. 
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Proposition III.l. Under the previous assumptions, 
the path space relative entropy TZ ( Q^q^^j | Q^q j^j j equals 



to 



\QlQ,T] I Q[O^T] 

where 

n (Q[o,t] I Q 



™(Qfo,^]|Qf+^J)+7^(M''|/+') , 

(17) 



S + E 

[0,T] 



(18) 



is t/ie relative entropy rate. 

Proof. The explicit formula for the RER was first given 
by Dumitrescu^^ for finite state space, though, we repro- 
duce the proof for the sake of completeness. Using the 
Girsanov formula, the relative entropy (jl7p is rewritten 
as 



n (gfo.T] I Q 



9 + e 
[0,T] 
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Jo 



[A"(a,) - A*+'(a.)] ds 



'10, T] 



'[0,T] 



log ■ 



,61+E 



[A''(a,) - A''+'(a,)] ds 



[0,T] 



log 



/^"(o-o) 



Exploiting the fact that the process Mt := Nt 
Jq {as~)d s is a martingale, we have that 



log ■ 



■dNs 



'[O.T] 



.0, M C {Os 



■ da 



Moreover, changing the order of the integrals and due 
to the stationarity of the process {(Tt}tGK+, the relative 
entropy is simplified to the following: 



ds 



^ ( Q[0,T] j Q[0,T] 



A {a)p (a, a) log — 



Jo 



(a, a') 



log 



/(a) 



M»+^(a) 



□ 



Fisher Information Matrix: Even though not directly 
evident, relative entropy rate for the jump Markov pro- 
cesses is locally a quadratic function of the parame- 
ter vector 9 G M.''. Hence, Fisher Information Matrix 
which is defined as the Hessian of the RER can be de- 
rived. Indeed, defining the rate difference 5c{a, a') = 



c^+'^(cr, cr') — c^{(T,a'), the relative entropy rate of Qt 



w.r.t. Qj^gt^] equals to 



^ {Q[o.t] I Qjotr]) 



[0,T] 



- Yl /('^)c*(cr,0-')log f 1 + 
5Z m''(o")5c((J,(j') 



Sc(a, a') 
c''{a, ct') 



cT, o-'eE 

1 s, s5c(cr,a')^ , 3 

= 2 2^ ^ cHo a') +0(M'^,'^)\ ) 



(19) 



Under a smoothness assumption on the transition rates 
in a neighborhood of parameter vector 9, which is also 
a checkable hypothesis, a Taylor series expansion of 
Sc{a, a') = e'^Vgc'^ia, a') + 0(|ep) results in 



(^Q[0,T] I Q[tXT] ) 

1 0, , (e^VflC»(a,a')) 



+ o(|.r) 



c«(a,a') 

^ /(^)c'^('^,'^')Velogc''(a,a') 



(20) 



X Ve logc*(a,a')^)e + 0(|e|^) 
ie^F«(gfo,T])£ + 0(kl') 



where 

Fn{Q[o.T]) ■= 
E 



Y c" {a, a' )'Vo log c" {a, a' )\7g log c {a, a' f 



(21) 



is the path space Fisher information matrix of a jump 
Markov process. It is based on the transition rates of the 
process which are typically known — they actually define 
the process — thus FIM as well as RER are numerically 
computable under mild ergodicity assumptions. Further- 
more, it is noteworthy that the only difference between 
the FIM of the Markov chains in the previous Section 
and the FIM of the continuous-time jump Markov pro- 
cesses is that in the latter the transition rates c^(cr, cr') are 
employed instead of the transition probabilities p^(cr, a'). 



IV. FURTHER GENERALIZATIONS 

The two previous Sections cover the cases of time- 
homogeneous Markov chains and pure jump Markov pro- 
cesses. The key observable for the parameter sensitivity 
evaluation is the Relative Entropy Rate which is the time 
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average of the path space relative entropy as time goes 
to infinity: 



n 



(Q[o. 



T] I Q[0X 



lim 

T->-oo 



T] 



\Q 



9+e 

[0,T] 



(22) 



Additionally, RER has an explicit formula in both cases 
making it computationally tractable as we practically 
demonstrate in Section|Vl Thus, if there are more general 
stochastic processes which also have an explicit formula 
for the RER, Fisher Information Matrix can be defined 
analogously and gradient-free sensitivity analysis is also 
doable. Next, we present two families of stochastic pro- 
cesses which have known RER. 

Time-periodic Markov Processes : Such Markov processes 
are typically utilized to describe circular physical or bi- 
ological phenomena such as annual climate models or 
daily behavior of mammals. Even though more general 
classes of processes can be presented, we restrict to the 
discrete-time Markov chains with finite state space E. 
The time-inhomogeneous transition probability matrix is 
denoted by p{a,a';m) and the periodicity implies that 
p{a,a';m) — p{a,u' ]kC, + m), Vfc 6 Z+ where C is the 
period. Assume that for all m = 0, C ^ 1 the process 
{crfeC+m}fe°=o which is a Markov chain has a unique sta- 
tionary distribution ^{x,m). Then the Markov process 
{^™}mez+ steady state regime is periodically station- 
ary with periodic stationary distribution /x. 

In terms of sensitivity analysis, the relative entropy 
rate between the path probabilities has the explicit for- 
mula 

C-i 



m— o-,a' £E 

e, I , P*(o-,o-';"i) 



X p (ct, ct'; rri) log 



P®+'(ct, a'\ m) 



(23) 



> > p (cr, (T ; m) log . ; r 

m=Oo-'e-B ^ V 1 1 ; 



Similar to the previous cases, a generalized formula for 
the path-space FIM can be derived. It is given by 



Fh(<3o,a/) := T E E m''('^,C)p''(o-,o-';"i) 

m=Oo-,o-'e-B (24) 

X Ve logp''(CT, a'; m)Vg logp''(cr, ct'; rnf . 

Existence of the relative entropy rate for general time- 
inhomogeneous Markov chains can also be found^S.. 

Semi-Markov Processes : These processes generalize the 
jump Markov processes as well as the renewal processes 
to the case where the future evolution (i.e., waiting times 
and transition probabilities) depends on the present state 
and on the time elapsed since the last transition. Semi- 
Markov processes have been extensively used to describe 
reliability models^, modeling earthquakes^, queuing 
theory^, etc. In order to define a semi-Markov process 
the definition of a semi-Markov transition kernel as well 
as its corresponding renewal process is required. Let E 



be a countable state space then the process { J„, Sn}n£Z+ 
is a renewal Markov process with semi-Markov transition 
kernel q{a,a';t) cj,a' e E, t e M+ if 



Pji/n + l ~ O"', S„ + \ — Sn < t\Jn ~ O", Jo , Sn, So)} 

= P{( Jn+1 = cr', S„+i - Sn < t\J„ ^ a} ~ q{a, a';t) . 

(25) 

The process J„ is a Markov chain with transition proba- 
bility matrix elements p{a, a') — lim(_>.oo ^(o', a' , t) while 
the process Sn is the sequence of jump times. Let 
Nt, t eR+ defined by Nt = sup{7i > : Sn < t he the 
counting process of the jumps in the interval (0, t\. Then 
the stochastic process Zt, t ^ R+ defined by Zt = 
for i > (or J„ = Z{Sn) for n > 0) is the semi-Markov 
process associated with (J„,S'„). 

Assume further that the (embedded) Markov chain J„ 
has a stationary distribution denoted by fi as well that 
the mean sojourn time with respect to the stationary dis- 
tribution defined by m := X^o- o-'efi /^(''^) Io° li'^^'^''^^) 
finite. Then it was shown in*'^ that the relative entropy 
rate of the semi-Markov process Zt with model param- 
eter vector 9 w.r.t. the semi-Markov process Zt with 
parameter vector 6 + e is given by 



H 



(Qio.t] 1 ' 



-[0,T] 



W°° V- »r ^ »r ' ^1 9!(^Xi£) . (26) 



while the Fisher information matrix is similarly defined 
as 



FhQIo.t] ~ ^ E m''(o-)i7''(o",o"'; s 

-^0 <.,a'eB (27) 

X Vs log q^{a, a'; s) Vs log q" (cr, a'; s)^ ds 



V. NUMERICAL EXAMPLES 

We demonstrate the wide applicability of the proposed 
methods by studying the parameter sensitivity analysis 
of three models with very different features and range of 
applicability. Namely, we discuss the Schlogl model, re- 
versible and irreversible Langevin processes and the spa- 
tially extended ZGB model. Each of these models reveals 
different aspects of the proposed method. However, we 
will first need to discuss the necessary statistical estima- 
tors for the Relative Entropy Rate and the Fisher Infor- 
mation Matrix. 



A. Statistical Estimators for RER and FIM 



The Relative Entropy Rate (|T2]), ^8} as well as the 
Fisher Information Matrix p^ . (I?!]) are observables of 
the stochastic process and can be estimated as ergodic 
averages. Thus, both observables are computationally 
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tractable since they also depend only on the local tran- 
sition quantities. We discuss each case separately next. 

Discrete-time Markov Chains : A statistical estimator for 
Markov Chains is directly obtained from (fT2|). For in- 
stance, in the continuous state space case, the n-sample 
numerical RER is given by 



1 ""-^ r 



(28) 



while the n-sample statistical estimator for FIM is 
1 f 

p(n) ^ _ / P*(o-i,cr')Velogp*((Ti,(T')Vslogp®(cri,cr')^do-', 

(29) 

where {o'i}"^g is a realization of the Markov chain with 
parameter vector 9 at steady (stationary) state. Thus the 
RER for various different perturbation directions (i.e., 
different e's) is computed from a single run since only the 
unperturbed process is needed to be simulated. However, 
the integrals in (1^51) and are rarely explicitly com- 
putable thus a second statistical estimator for both RER 
and FIM is obtained from the Radon-Nikodym derivative 
pO| in the path space. It is given by 



while the second estimator for FIM is 



^ n — 1 

71 ^ ^ 



(30) 



(31) 



Even though, the second approach is tractable for any 
transition probability function, it suffers from larger vari- 
ance (see also Fig. [T]), since the summation over all the 
possible states in psp results in estimators with less vari- 
ance compared to the variance of estimator (|30|) . Hence, 
the first numerical estimator is preferred whenever ap- 
plicable (for instance, when the state space is finite and 
relatively small). Finally, the estimators are valid also 
for time inhomogeneous Markov chain where {(Ji, Ci+i) 
is replaced by p^(crj, cTj+i; i). 

Continuous-time Markov Chains : The estimators for 
CTMC are constructed along the same lines. Indeed, 
the first estimator for RER is based on ([T8|) and it is 
given by 



n-l 



T 

i=0 



(32) 



where Ar^ is an exponential random variable with pa- 
rameter A(cri) while T = '^^^Ti is the total simulation 
time. The sequence {(Ti}"=o ^^e embedded Markov 
chain with transition probabilities p^{<7i,a') = ^ x{a^ 
at step i. Notice that the weight Ar^ at each step which 
is the waiting time at state <Ti is necessary for the correct 



estimation of the observable^. Similarly, the estimator 
for the FIM is 



a'eE 



(33) 

Notice that the computation of the local transition rates 
c^{ai,a') for all ct' G £^ is needed for the simulation of 
the jump Markov process when Monte Carlo methods 
such as stochastic simulation algorithm (SSA)'*'' is uti- 
lized. Thus, the computation of the perturbed transi- 
tion rates is the only additional computational cost of 
this numerical approximation. On the other hand, the 
second numerical estimator for RER is based on the Gir- 
sanov representation of the Radon-Nikodym derivative 
(i.e., p^ ) and it is given by 



^ n — 1 

- E ■ 



.9 + E 



(o"i,(Ti + l) T 



i£Ar,(A*(aO-A*+'(aO) 



(34) 

Similarly we can construct an FIM estimator. Notice 
that the term in (1341) involving logarithms should not 
be weighted since the counting measure is approximated 
with this estimator. Unfortunately, the estimator (|34|) 
has the same computational cost as ([5^ due to the need 
for the computation of the total rate which is the sum 
of the local transition rates. Furthermore, in terms of 
variance, the latter estimator has worse performance due 
to the discarded sum over the states a' . 

Finally, we complete this section with a proposition 
that states that all the proposed estimators are unbiased. 

Proposition V.l. Under the assumptions of Proposi- 
tion \II.1\ for Markov chains or of Proposition \III.1\ for 
jump Markov processes, the numerical estimators I128\} - 
^34^ are unbiased. 



Proof. The proofs for each estimator are similar and they 
are more or less hidden in the proofs of Propositions III. 1 1 
and nil. 11 Nevertheless, we provide the proof for the 
estimator (|30|) for the sake of completeness. We have 
that 



n — 1 



p^{ai,ai+i) 
p«+^(CT,,cri+l) 



X ^^((To)p^((To,cri) • • •p^(cr„_i,a„)dCTo • ■ • da„ 

1 f f P^(0-i,CTi+l) g g \J J 



□ 



= n{Q'\Q'+^) 

which completes the proof. 
B. Schlogl Model 



The Schlogl model describes a well-mixed chemical re- 
action network between three species A, B, X— The 
concentrations A, B are kept constant while the reaction 
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rates fci, fc4 are the parameters of the modeL Table |T] 
provides the propensity functions (rates) for these reac- 
tions where is the volume of the system Note that 
serves as a normalization for the reaction rates making 
them of the same order. Thus, there is no need to re- 
sort in logarithmic sensitivity analysis even though this 
is possible (see Appendix lA)) . 



TABLE I. The rate of the feth event when the number of X 
molecules is x is denoted by Ck(x). is the volume of the 
system. 



TABLE IL Parameter values for the Schlogl model. 



Event 


Reaction 


Rate 


1 


A + 2X 


ci{x) = kiAx{x - l)/(2fi) 


2 


3X A + 2X 


C2(x) = k2x{x - l){x - 2)/(6!r2^) 


3 


B ^ X 


C3{x) = k'iB^ 


4 


X ^ B 


C4,(x) = k^x 



The stochastic process describing the number of X 
molecules of the Schlogl model is a CTMC with rates pro- 
vided in Table m Since the Schlogl model is a birth/death 
process, the exact stationary distribution /i(a;), can be it- 
eratively computed from the reaction rates utilizing the 
detailed balance condition"'^. It states that 



fj,{x)c{x, X -I- 1) = ^i{x + l)c{x + 1, x) 



(35) 



where c{x,x + l) = ci{x) + c:i{x) is the birth rate at state 
X while c{x,x — 1) = C2{x) + C4{x) is the death rate of 
the same state. Having the exact stationary distribution 
a simple benchmark for the sensitivity of the system is 
provided. Furthermore for the parameter values in Ta- 
ble nil the stationary distribution of the Schlogl model 
possesses two most probable constant steady states (see 
also Fig. [21 solid lines). Thus, the stochastic process 
is non-Gaussian and Gaussian approximationsii are in- 
valid, at least at long times where transitions between 
the most likely states take place, see (see Figs. [1] and [3]). 
Capturing these transitions is a crucial element for the 
correct calculation of stationary dynamics and the effi- 
cient sampling of the stationary distribution. Notice also 
that there are studies on sensitivity analysi s "'^'^^ where 
the Schlogl model with volume $7 = 100 has been used 
for benchmarking, however, for this value of the most 
likely states in Fig. |3l are steep and the simulation algo- 
rithm is trapped, depending on the initial data, into the 
one of the two corresponding wells. Thus, for deep wells 
it takes an exponentially long time to make a transition 
from one to the other well, consequently, the sensitivity 
analysis is biased and depends on the initial value of the 
process. In fact, in the case of deep wells the Gaussian ap- 
proximation is correct and the FIM analysis^^ applies as 
long as the process remains trapped. In a intuitive sense, 
the volume can be thought as the inverse temperature 
of the system making the stationary distribution more or 
less steepi^. 

Let denote 9 = [fciA, ^2, fca^B, ^4]^, then the numeri- 
cal estimator for RER as well as for FIM for the Schlogl 



Parameters 


Q 


kiA 


k2 


ksB 


k4 


Values 


15 


3 


1 


2 


3.5 





1000 1500 
Time 



0.04 
" 0.03 



Estimator (32) 

Estimator (34) 

exact 



1000 1500 
Time 



FIG. 1. Upper plot: The number of X molecules as a func- 
tion of time. The stochastic process sequentially visits the 
two most probable states defined as the maxima of the PDF. 
Lower panel: RER as a function of time when kiA is per- 
turbed by 0.05 computed using (|32[) (dashed line) and using 
P4p (grey line). In both cases, the accuracy of the numerical 
estimators increase as the number of samples increases. 



model is given by ([32]) and ([33)) . respectively. The upper 
panel of Fig. [H shows the number of X molecules in the 
course of time. The number of jumps of this process are 
10^ while the initial value Xq = 100 is slightly above the 
minimum of the second well. The lower panel of Fig. [Tl 
shows the numerical RER (dashed line) as a function of 
time when only kiA is perturbed by 0.05 (i.e., perturba- 
tion is e = 0.05ei) as well as the exact RER computed 
from (jlSp . For comparison purposes, we also plot the 
RER estimator (|34p . Obviously, as simulation time is in- 
creased both numerical RER estimators converge to the 
exact value even though the estimator (|34p needs more 
samples to converge (i.e., its variance is larger). Notice 
that enough transitions between the two steady states are 
necessary in order to obtain robust results. Fig. [51 depicts 
the exact RER (circles), the numerically-computed RER 
(stars) as well the FIM-based RER (squares) . The direc- 
tions ieoCfe, k = 1, ...,4 where eg is set to 0.05 while 
are the typical orthonormal unit vectors are considered. 
These directions correspond to the perturbation of just 
one of the model's parameters. The number of jumps of 
this simulation is 5 • 10^ while the initial value is again 
Xq — 100. The numerically-computed RERs have sim- 
ilar values with the exact ones as Fig. [2l demonstrates. 
The computed RERs imply that the most sensitive pa- 
rameter is fc2 (corresponds to ±62) while the least sen- 
sitive parameter is k^B (corresponds to ±63). Another 
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important feature of the proposed sensitivity method is 
that the RERs for all the different parameter perturba- 
tions are computed from a single simulation run of the 
unperturbed process. Thus, for each direction, the only 
additional computational cost is the calculation of the 
perturbed rates of the process. Notice also that RER 
gives different values between a direction and its opposite 
resulting in assigning different sensitivities while FIM- 
based RER cannot distinguish between the two opposite 
directions since it is a second-order (quadratic) approxi- 
mation. 



parameter. 



0.08 

0.07 

2 0.06 

I 0.05 

I 0.04 

.| 0.03 

^ 0.02$ 

0.01 
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FIG. 2. Exact (circles), numerical (stars) and FIM-based 
(squares) RER for various directions. k2 is the most sensitive 
parameter followed by kiA while the least sensitive parame- 
ters are k4 and k^B. 



Reference 
- - - Most sensitive param. 
Least sensitive param. 




80 100 120 
Number of X molecules 



FIG. 3. Upper plot: The stationary distributions for the un- 
perturbed process (solid line), the most sensitive parameter 
k2 (dashed line) and the least sensitive paramter kgB (dotted 
line). Lower plot: The stationary distributions for the unper- 
turbed process (solid line), the most sensitive parameter k2 
(dashed line) and the most sensitive direction tmax (dotted 
line). 



We further validate the inference capabilities of RER 
by illustrating the actual stationary distribution of the 
perturbed processes. It is expected that the most/least 
sensitive parameters of the path distribution should be 
strongly related with the most /least sensitive parameters 
of the stationary distribution. Indeed, the upper panel 
of Fig. [3] presents the stationary distributions of the un- 
perturbed process (solid line) as well the perturbed sta- 
tionary distribution of the most (dashed line) and least 
(dotted line) sensitive parameters. The perturbation of 
the most sensitive parameter results in the largest change 
of the stationary distribution while the smallest change 
is observed when the least sensitive parameter is per- 
turbed. Moreover, FIM can be used for the computation 
not only of the most sensitive parameter but also for 
the computation of the most sensitive direction in gen- 
eral. Indeed, the most sensitive direction can be found 
by performing eigenvalue analysis to the FIM. The eigen- 
vector with the highest eigenvalue defines the most sen- 
sitive direction. In our setup, the most sensitive direc- 
tion is Emax = [0,0.978,0,0.207]. The prominent param- 
eter of the most sensitive direction is /c2 which is not a 
surprise since, from Fig. [2l fc2 is the most sensitive pa- 
rameter. The lower panel of Fig. [3] depicts the station- 
ary distribution of the most sensitive parameter (i.e., ^2 
or —£062) (dashed line) and the most sensitive direction 
(i.e., eoEmax) (dotted line). It is evident that the station- 
ary distribution of the most sensitive direction is further 
away from the unperturbed stationary distribution com- 
pared to the stationary distribution of the most sensitive 



C. Reversible and non- reversible Langevin 
Processes 

The second example we consider is a particle model 
with interactions which have been applied and studied 
primarily in molecular dynamics^^-^ but also in biol- 
ogy (for instance, in swarming^), etc. In molecular dy- 
namics, the Langevin dynamics is typically a Hamilto- 
nian system coupled with a thermostat (i.e., noise). A 
Langevin process is defined by the SDE system 

dqt = —ptdt 
m 

dpt = -F{qt)dt - ^ptdt + adBt (^6) 



where G M is the position vector of the N particles 
in d dimensions, pt S M'*^ is the momentum vector of 
the particles, m is the mass of the particles, F is a driv- 
ing force, 7 is the friction factor, a is the diffusion fac- 
tor and Bt is a dA'^-dimensional Brownian motion. The 
first equation which describes the evolution of the po- 
sition of the particles is deterministic thus the overall 
SDE system is degenerate. In the zero-mass limit or the 
infinite- friction limit, Langevin process is simplified to 
overdamped Langevin process which is non-degenerate, 
however, several studies advocate the use of Langevin 
dynamics directl y^^'^^ . The proposed sensitivity analysis 
approach is widely applicable to SDE systems once the 
assumption on ergodicity is satisfied. 
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The vector field F(-) denotes the force exerted on the 
system and here we assume it consists of two terms: a 
gradient (potential) component as in typical Langevin 
systems, as well as an additional non-gradient term, 
where the latter is assumed to be divergence-free: 

F(q) = V,y(<z) + aG(q) , (37) 

and Vq • G — 0. Here we consider particular examples 
to illustrate the applicability of the proposed sensitiv- 
ity analysis methods. The gradient term in (|37|) models 
particle interactions given by 

i,j<i 

where VMir) is the three-parameter Morse potential 
VMir) = Deil - e-''('-''<=))2. The Morse potential in- 
cludes a combination of short-range repulsive and long- 
range attractive interactions and has been extensively 
used in molecular simulations^. The divergence-free 
component is taken to be a simple antisymmetric force 
given by 

G^{q) = q^+l - q^-l , i = l,...,N (39) 

where qq = qn and qn+i — <?!■ 

We now return to p7p and discuss the implications 
of its structure. When a = 0, the Langevin process is 
reversible meaning that the condition of detailed bal- 
ance is satisfied with respect to a known Gibbs sta- 
tionary probability distribution^^. However, knowing 
the stationary distribution explicitly is insufficient to 
carry out sensitivity analysis on the stationary dynamics 
which typically may include dynamic transitions between 
metastable states, as in the Schlogl Model discussed ear- 
lier. Furthermore, when a ^ 0, detailed balance does 
not hold true in general and the stationary probability 
distribution of the corresponding Langevin process is not 
known since the system is non-reversible^^iS. Examples 
of forces such as ([57)) that include non-gradient terms and 
yield non-reversible Langevin equations, arise typically 
in driven systems, for instance in Brownian particle sus- 
pensions where particles interact with a fluid flow^*. For 
non-reversible systems no efficient method for sensitiv- 
ity analysis has been reported in the literature, at least 
for the cases dealt here, namely (a) long-time, station- 
ary dynamics (also referred to as non-equilibrium steady 
states (NESS)^=iS,)j as well as, (b) the unknown station- 
ary probability. Our proposed path-space RER sensitiv- 
ity methods can address these challenges and is straight- 
forwardly applicable to both reversible and non-reversible 
Langevin equations as we show next. 

First, an explicit EM-Verlet (symplectic)-implicit EM 
scheme is applied for the discretization of p6p . It is writ- 



ten as 

p,+ i ^Pr- ^(gO^ - m^'T ^ '^^^^ 

qi+i ^ qi+m'^p-.iAt 

+ ^ 40 
At J At 
P.+i = P,+ 1 - F(q,+i)— - —P^+l^ + '^AW^»+i 

with AWi, AW^^i - iV(0, ^IcIn) where N is the muhi- 
variate normal distribution. This numerical scheme also 
known as BBK integrato r^^i^^ utilizes a Strang splitting. 
Thus, the discretized Langevin process is a Markov chain 
with continuous state space. Notice that the numerical 
scheme is non-degenerate, thus, the transition probabil- 
ity from state (q,p) to state {p',q') is given by 

Piq,P,q\p') = Piq'\q,p)P{p'\q\q,p) (41) 

where 

P{q'\q,p) = ±e-OTk'-?+(f-F(9)A^+p4l?)Atr (42) 
and 

P{p'\q',q,p) = ^e-^|(l+^^)'''-(S(9'-«)-^F(9'))l' 
Zi 

(43) 

where Zi, i — 0,1 are the respective normalization 
constants. Let now define the parameter vector 9 = 
[De,a,re]- Then, the discretized Langevin model (j40|) 
is a discrete-time Markov process with R^*^^ being the 
state space. The statistical estimators for RER as well 
as for FIM are given by (150]) and ([5T|) . respectively. No- 
tice that the estimators with larger variance were chosen 
because the integration of the transition probability den- 
sity function w.r.t. the positions is not a trivial problem, 
if not intractable in high dimensions. 

TABLE III. Parameter values for the discretized Langevin 
system. 



Parameters 


N 




a 


re 


m 


7 


a 


At 


Values 


3 


0.3 


0.3 


1 


1 


1 


0.1 


0.01 



The upper panel of Fig. |4] depicts the numerical RER 
as a function of simulation time for the parameter val- 
ues given in Table IIIIl The reversible case is considered 
while the sensitivity of the parameters is obtained from 
the directions defined by the orthonormal unit vectors 
multiplied with eo — 0.05. Since the initial positions and 
momenta where randomly chosen from a uniform distri- 
bution an initial out-of-equilibrium time regime can be 
seen in the Figure (up to time to = 100). Moreover, the 
variance of RER as an observable is rather large which 
can be explained by the small number of particles. Sys- 
tems with more particles are expected to converge faster 
due to averaging effects. The lower panel of Fig. |4]depicts 
the RER at final time t ~ IC* with an initial equilibra- 
tion time to = 100 where the numerical RER is discarded. 
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Evidently, the most sensitive parameter is a followed by 
Z?e while the least sensitive parameter is Ve- 
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FIG. 4. Upper plot: Relative entropy rate as a function of 
time for perturbations of De (solid line), a (dashed line) and 
of Te (grey line) at the reversible regime (a = 0). The variance 
of the numerical RER is large, necessitating more samples for 
accurate estimation. Lower plot: RER for various directions. 
The most sensitive parameter is a. 



Utilizing our methodology the parameter sensitivity 
of not only the reversible regime but also of the non- 
reversible, a 7^ 0, regime can be explored even though 
the stationary probability is not known. Fig. [5] shows the 
level sets of the FIM matrix for the reversible case (upper 
plots, a = 0) and for the irreversible case (lower plots, 
a = 0.1). Figure suggests that the additional irreversible 
component results in the fact that some directions be- 
came more sensitive and some other directions became 
less sensitive. Further validation is obtained from the 
eigenvalues of the FIM which are 7.30, 0.592, 0.015 for the 
reversible case while the eigenvalues for the irreversible 
case are 13.90,0.302,0.074. Finally, FIM can be very 
useful in various ways for the quantification of sensitiv- 
ity analysis. For instance, the determinant of FIM which 
in optimal experiment design is called A-optimality can 
be used as a measure of parameter identificationiii^^i^l. 
In our particular example, the determinants are 0.065 
and 0.313 for the reversible and irreversible cases, respec- 
tively. This result asserts that in the non-reversible case 
a 7^ in (|37l) . the divergence-free component improves 
the ability of any estimator of the potential's parameters. 



D. Spatially extended Kinetic Monte Carlo 
models 

The applicability of the proposed sensitivity method 
is further demonstrated in spatially extended systems 
which exhibit complex spatio-temporal morphologies 
such as islands, spirals, rings, etc. at mesoscale length 
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FIG. 5. Upper plots: Level sets (or neutral spaces) for the 
reversible case [a = Q). Lower plots: Level sets for the irre- 
versible case [a = 0.1). 



TABLE IV. The rate of the fcth event of the jth site given that 
the current configuration is a is denoted by Ck{j; cr) where n.n. 
stands for nearest neighbors. 



Event 


Reaction 


Rate 


1 


^ CO 




2 


^ O2 




3 


CO + O^ CO2 + des. 




4 


+ CO ^ CO2 + des. 





scales. Among the various surface mechanisms such as 
adsorption, desorption, diffusion, etc. we focus on CO 
oxidation which is a prototypical example for molecular- 
level reaction-diffusion mechanism between adsorbates 
on a catalytic surface. A simplified CO oxidiza- 
tion model without diffusion known as the Ziff-Gulari- 
Barshad (ZGB) model^*^ is considered. Despite being an 
idealized model, the ZGB model incorporates the basic 
mechanisms for the dynamics of adsorbate species during 
CO oxidation on catalytic surfaces, namely, single site 
updates (adsorption/desorption) and multisite reactions 
(two neighboring sties being involved). Due to the reac- 
tions between species, the ZGB model is non-reversible 
and its stationary distribution is unknown. Nevertheless, 
our sensitivity analysis methodology is capable of quan- 
tify the parameter sensitivities utilizing only the rates of 
the process which are provided in Table IIVI The spins 
of the two dimensional lattice A^r with N lattice sites 
take values = denoting a vacant site j G A^r, 

— —1 for a CO molecule at site j and a{j) = 1 for 
an O molecule. Depending on the local configuration of 
site j as well as of the nearest neighbors, the events with 
the respective rates provided in Table HVl are executed. 
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The ZGB model is a high-dimensional CTMC which 
here is simulated utilizing the stochastic simulation 
algorithnt^. For each step of the simulation, the rates of 
the process for all sites of the lattice are needed. Inter- 
estingly, in order to perform our sensitivity analysis to 
the system parameters, only the rates are incorporated. 
Indeed, denoting hy 9 = [ki,k2\ the parameter vector, 
then the statistical estimators of RER as well as of FIM 
for the ZGB model are given by ((32|) and ([33| . respec- 
tively. Nevertheless, we explicitly provide the numerical 
RER estimator for convenience: 



(44) 



where c^{j;a) is the fcth event of lattice site j 
when the lattice configuration is cr while A^((t) — 

SjGAjv ^k=i "^fcO' ''') total rate of the process at 

state CT. 

The upper panel of Fig. |6] depicts the RER as a func- 
tion of simulation time when ki — 0.35 is perturbed by 
eo = 0.02 (solid line) and when ^2 = 0.85 is perturbed by 
the same amount. It is evident that after an initial burn- 
ing time, RER converges fast to a limit value implying 
that the variance of RER as an observable is small. This 
can be explained by the fact that at each step of the sim- 
ulation, averages the over the entire lattice in order 
to compute the instantaneous RER. The lower panel of 
Fig. ini depicts the RER at final time t = 100 with an ini- 
tial equilibration time to — 10 where the instantaneous 
RER is discarded. Obviously, the most sensitive parame- 
ter is ki which is related with the adsorption mechanism 
while the least sensitive is ^2- In order to further vali- 
date our findings, we plot the lattice configuration when 
either fci or /c2 is perturbed by eo = 0.02. Fig. [7] depicts 
the configuration of the unperturbed system as well as 
the configurations when one of the two model param- 
eters are perturbed. Evidently, the configuration when 
the most sensitive parameter (i.e., ki) is perturbed is less 
similar to the unperturbed configuration compared to the 
configuration when the least sensitive parameter (i.e., ^2) 
is perturbed. 

Thus far, we have performed local sensitivity analysis 
meaning that we were concentrated around a single point 
of the parameter space. Even though various global sen- 
sitivity analysis approaches have been derived based on 
variance^i^ or on mutual information^, here, we present 
a demonstration of global sensitivity analysis based on 
a phase diagram of the most and least sensitive direc- 
tions. Indeed, any direction can be seen as a vector field 
and a phase diagram of a subset of the parameter regime 
can be visualized. Fig. [5] depicts the most (solid) and 
least (dashed) sensitive directions which correspond to 
the stronger and weaker eigenvalues of the FIM, respec- 
tively. Notice that the most/least sensitive directions are 
parallel to the axes which asserts that the FIM is diago- 
nal. This can be explained by the fact that the param- 
eters of the model fci and k2 affect different rates in a 




0.12 
^ 0.1' 

^ 0.08 
I 0-06 
0.04 



* Numerical 
□ FIM-based 



Various directions 



FIG. 6. Upper plot: Relative entropy rate as a function 
of time for perturbations of both fci (solid line) and of k2 
(dashed line). An equilibration time until the process reach 
its metastable regime is evident. Lower plot: RER for various 
directions. The most sensitive parameter is fci. 
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FIG. 7. Configurations obtained by eo-perturbations of the 
most and least sensitive parameters. The comparison with 
the reference configuration reveals the differences between the 
most and least sensitive perturbation parameters. 



decoupled fashion (check Table WV} . 

Finally, we note that even though we have considered 
a spatial KMC model with few parameters to assess their 
sensitivity, our emphasis is primarily on (a) the high di- 
mensionality of the process, and (b) the non-reversibility 
of the process without prior knowledge of the station- 
ary probability distribution. For such complex systems 
there appears to be no previous systematic work in the 
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literature on sensitivity analysis. 
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FIG. 8. Vector field with the most (solid arrows) and least 
(dashed arrows) sensitive directions computed from eigen- 
value analysis of FIM. The length of the arrows is proportional 
to the corresponding eigenvalue. 



VI. CONCLUSIONS 

Here we proposed a novel method for sensitivity analy- 
sis of complex stochastic dynamics, based on the concept 
of Relative Entropy Rate between two stochastic pro- 
cesses. The method is computationally feasible at the 
stationary regime and involves the calculation of suit- 
able observables in path space for the Relative Entropy 
Rate and the corresponding Fisher Information Matrix. 
The stationary regime is crucial for stochastic dynamics 
and can allow us to address the sensitivity analysis of 
complex systems, including examples of processes with 
complex landscapes that exhibit metastability and strong 
intermittency, non-reversible systems from a statistical 
mechanics perspective, and high-dimensional, spatially 
distributed models. Our proposed methods bypass these 
challenges relying on the direct Monte Carlo simulation 
of rigorously derived observables for the Relative En- 
tropy Rate and Fisher Information in path space rather 
than on the stationary probability distribution itself. The 
knowledge of the Fisher Information Matrix provides a 
gradient-free method for sensitivity analysis, as well as al- 
lows to address questions of parameter identifiability and 
optimal experiment design in complex stochastic dynam- 
ics. 

Although the proposed methods are widely applicable 
to many stochastic models, we demonstrated their capa- 
bilities by focusing on two classes of problems. First, on 
Langevin particle systems with either reversible (gradi- 
ent) or non-reversible (non-gradient) forcing, highlighting 
the ability of the method to carry out sensitivity analy- 
sis in non-equilibrium systems; second, on spatially ex- 



tended Kinetic Monte Carlo models, showing that the 
method can handle high-dimensional problems. In fact, 
we showed that the proposed approach to sensitivity 
analysis is suitable for non-equilibrium systems, where 
the structure of the stationary PDF is unknown and is 
typically non-Gaussian. Finally, the sensitivity estima- 
tors can be easily embedded in any available molecu- 
lar simulation methods such as Kinetic Monte Carlo or 
Langevin solvers. 
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Appendix A: Sensitivity analysis on the 
logarithmic scale 

In many applications, the model parameters differ by 
orders of magnitude and the only meaningful option in 
order to study sensitivity analysis is to perform rela- 
tive parameter perturbations. This is done by perturb- 
ing the logarithm of the model parameters instead of 
the parameters itself. Thus, utilizing the chain rule for 
Viog9/(0) = Vgfie).Vioge0 = 0.V8fi0) where '.' means 
element by element multiplication, the logarithmic-scale 
Fisher information matrix has elements: 

{Fn{Q'°^'))^^^^e,e,{FniQ''))^^^ , ij = l,...,k. (Al) 

Similarly, the logarithmic perturbation for the RER is 
performed by utilizing the perturbation vector 9.e instead 
of e. Notice that ([7]) continuous to be valid for the loga- 
rithmic scale. Indeed, it holds that 

n (q" I Q'^(^+'') = i(e.e)^FH(Q'°"')(e.e) + Oi\6.ef) . 

(A2) 
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